A De Novo Transcriptome Analysis Identifies Cold-Responsive Genes in the Seeds of Taxillus chinensis (DC.) Danser

Taxillus chinensis (DC.) Danser, a parasitic plant of the Loranthaceae family, grows by attacking other plants. It has a long history of being used in Chinese medicine to treat multiple chronic diseases. We previously observed that T. chinensis seeds are sensitive to cold. In this study, we performed transcriptome sequencing for T. chinensis seeds treated by cold (0°C) for 0 h, 12 h, 24 h, and 36 h. TRINITY assembled 257,870 transcripts from 223,512 genes. The GC content and N50 were calculated as 42.29% and 1,368, respectively. Then, we identified 42,183 CDSs and 35,268 likely proteins in the assembled transcriptome, which contained 1,622 signal peptides and 6,795 transmembrane domains. Next, we identified 17,217 genes (FPKM > 5) and 2,333 differentially expressed genes (DEGs) in T. chinensis seeds under cold stress. The MAPK pathway, as an early cold response, was significantly enriched by the DEGs in the T. chinensis seeds after 24 h of cold treatment. Known cold-responsive genes encoding abscisic acid-associated, aquaporin, C-repeat binding factor (CBF), cold-regulated protein, heat shock protein, protein kinase, ribosomal protein, transcription factor (TF), zinc finger protein, and ubiquitin were deregulated in the T. chinensis seeds under cold stress. Notably, the upregulation of CBF gene might be the consequences of the downregulation of MYB and GATA TFs. Additionally, we identified that genes encoding CDC20, YLS9, EXORDIUM, and AUX1 and wound-responsive family protein might be related to novel mechanisms of T. chinensis seeds exposed to cold. This study is first to report the differential transcriptional induction in T. chinensis seeds under cold stress. It will improve our understanding of parasitic plants in response to cold and provide a valuable resource for future studies.


Introduction
Taxillus chinensis (DC.) Danser, also named "Sang Ji Sheng," is a parasitic plant from the Loranthaceae family and grows by attacking other plants, such as Aceraceae, Anacardiaceae, Euphorbiaceae, Fabaceae, Fagaceae, Juglandaceae, Moraceae, Rosaceae, and Rutaceae [1]. Because the leaves and stems of T. chinensis have been used to treat angina pectoris, arrhythmia, hypertension, rheumatism, stroke, and threatened abortion [1], it has a long history of being used in the Chinese medicine. It is mainly distributed in the southern and southwestern areas of China, probably due to the warm and humid climate. Our knowledge of T. chinensis is very limited, especially its responses to abiotic stresses.
Transcriptome sequencing, a next-generation sequencing technology, is an efficient method to detect gene expression profiles and elucidate the breadth of molecular mechanisms involved in many physiological processes [16]. It has been widely used to identify key genes and factors involved in the responses to abiotic/biotic stresses in plants due to its advantages in the large-scale functional assignment of genes, thorough qualitative and quantitative analyses of gene expression, improved sensitivity, and accurate profiling of eukaryotic transcriptomes for both model and nonmodel organisms [17,18]. Importantly, it facilitates analyses of gene expression in organisms whose genomes are not accessible. For example, using transcriptome sequencing and de novo assembly analysis, Liu [21]. Similar to these studies, transcriptome sequencing technology will enable us to study changes in gene expression changes in T. chinensis seeds under cold stress. Previously, our lab identified genes that are expressed in response to water loss in T. chinensis seeds [22], such as RD22, HSP, and various TFs (MYB, WRKY, and ethyleneresponsive transcription factors), and reported the regulatory miRNAs in T. chinensis seeds in response to cold [23], such as miR408, miR393b, miR946, ath-miR779.2, miR398, and miR9662. Interestingly, ICE3, IAA13, and multiple TFs (e.g., WRKY and CRF4 and TCP4) were shown to be targets of the dysregulated miRNAs identified in the T. chinensis seeds under cold stress [23]. In the present study, we used the same material as the miRNA study and investigated changes in gene expression changes in T. chinensis seeds in response to cold using transcriptome sequencing technology. This study is the first to analyze the changes in gene expression in T. chinensis seeds under cold stress, and our results will improve our understanding of molecular mechanisms of cold stress in T. chinensis seeds.

Sample Collection and Cold
Treatment. The original seeds of T. chinensis were obtained from three mulberry trees in the wild, and no permissions were required to collect these samples. The seeds were then confirmed by a senior botanist and deposited in the herbarium of Guangxi Botanical Garden of Medicinal Plants (acc. S0001794). For the cold treatment experiment, we selected 300 T. chinensis seeds with similar appearances, including sizes, weights, and health conditions. We observed that the T. chinensis seeds were sensitive to temperature and that 0°C was a suitable temperature to study the cold-responsive genes [23]. The seeds were divided into three groups-no treatment (A0) and cold treatment for 12 h (A1), 24 h (A2), and 36 h (A3). Then, we examined the viability of A0, A1, A2, and A3 by immersing the seeds in a solution of 1% (w/v) 2,3,5triphenyl tetrazolium chloride (TTC, Sigma), as previously described [22].

Total RNA Isolation, cDNA Library Construction, and
Deep Sequencing. Total RNA was extracted from the T. chinensis seeds (100 mg) exposed to cold for 0 h (A0), 12 h (A1), 24 h (A2), and 36 h (A3) using TRIzol reagent, as previously described [22]. Next, an Agilent 2100 Bioanalyzer (Agilent Technologies) was used to determine the quantity and quality of the total RNA. Equal amounts of total RNA (1 μg) were used to construct the cDNA libraries for transcriptome sequencing. Briefly, mRNAs were enriched with magnetic oligo (dT) beads and fragmented into~200 bp fragments, followed by the double strand cDNA library construction using random hexamer N6 primers. Next, the double strand cDNA libraries were end repaired by adding a phosphate at the 5 ′ -end and sticky "A" to the 3 ′ -end. After the 9 BioMed Research International sequencing primers were ligated to each library and multiple libraries were pooled using the index technology, the pooled library was sequenced on the BGISEQ-500 RS platform with a paired-end 150 (PE150) strategy at BGI-Shenzhen.

Read Cleaning and De Novo Assembly of the
Transcriptome. Raw reads were processed using SOAPnuke to remove the low-quality reads, reads with adaptors, and contaminating reads [24]. The obtained clean reads were quality controlled using FASTQC, as previously described [25]. Then, the clean reads of all samples assembled into the transcriptome using the TRINITY software with default parameters, according to a published protocol [26]. We next aligned all the clean reads to the assembled transcriptome using Bowtie2 and determined the global gene expression profile using the RSEM (RNA-Seq by Expectation-Maximization) method [27,28]. The fragments per million reads per kilo base mapped (FPKM) method was used to normalize gene expression. We filtered genes expressed at low levels (FPKM < 1) from the assembled transcriptome for quality control of the assembled transcriptome. CD-HIT was used to cluster the assembled T. chinensis genes [29]. BUSCO was used to evaluate the completeness of the assembled transcriptome [30].

Transcriptome Annotation.
We annotated the assembled T. chinensis transcriptome by mapping it to public databases. In detail, the transcriptome was aligned to the NT database using BLASTn and aligned to the COG, KEGG, NR, and SwissProt databases using BLASTx. Then, open reading frames (ORFs) were predicted in the assembled transcriptome and aligned to the InterPro database using InterProScan. Gene Ontology annotations were retrieved from the mapping results of InterPro and NR. KEGG pathway annotation results were retrieved from the KEGG pathway mapping results. In combination with the BLAST results, we fetched the possible coding sequence (CDS) regions in the assembled transcriptome. Likely proteins encoded by the assembled transcriptome were predicted by TransDecoder and were used to identify signal peptides and transmembrane helices with SignalP (v5.0b) and TMHMM (v2.0c), respectively, according to the manufacturers' protocols.

Gene Expression Profile and Differential Expression
Analysis. We aligned the clean reads of each sample to the assembled transcriptome using Bowtie2 to profile gene expression in the T. chinensis seeds under cold stress [27]. Then, the RSEM method was used to count the reads aligned to each gene and normalize the gene expression using the FPKM method [28]. We used edgeR to calculate the p values and false discovery rate (FDR) to identify differentially expressed genes in the T. chinensis seeds under cold stress. Some cut-offs were used to identify the DEGs, including log2 fold change ðlog 2FCÞ > 1 or < -1, p value < 0.05, coefficient of variation ðCVÞ < 0:7, and false discovery rate ðFDR Þ < 0:1.
2.6. Functional Analysis. Using the GO and KEGG pathway annotations for the assembled T. chinensis seed transcriptome, we analyzed the enriched GO terms (biological processes, molecular functions, and cellular components) and KEGG pathways of the DEGs. Initially, p values were calculated using the Fisher's exact test to show the significance, and q values were calculated using the R package "qvalue" for quality control of the p values.
2.7. qRT-PCR. We selected 11 genes for qRT-PCR validation, and 18S rRNA was used as an internal control. Forward and reverse primers were predicted using Primer3 and synthesized at BGI (Shenzhen) ( Table S1). The procedure for the qRT-PCR experiment was the same as that in our previous study [22]. Then, we calculated the ΔCt value to determine the expression levels of target genes in a sample and ΔΔCt value to assess the difference in gene expression between two samples. We used relative normalized expression (RNE) to show the gene expression changes: RNE = 2 −ΔΔCt and calculated its log2 value (log2RNE) to compare the changes in gene expression identified using RNA-Seq and qRT-PCR.

Overview of Deep Sequencing Results and De Novo
Assembly. We previously observed that T. chinensis seeds were sensitive to temperature and that 0°C was a suitable temperature to study the cold responsive genes in T. chinensis seeds [23]. Thus, we performed paired-end transcriptome sequencing in triplicate for T. chinensis seeds exposed to 0°C for 0 h (A0), 12 h (A1), 24 h (A2), and 36 h (A3). Initially, we obtained 728.82 million raw reads and 725.59 million clean reads after data cleaning. Then, the clean reads of all samples were merged and used for the de novo assembly and analysis. TRINITY assembled 257,870 transcripts derived from 223,512 genes. After the expression levels of the assembled genes were estimated, we filtered genes at low levels (FPKM < 1) from the assem-bled transcriptome and clustered the genes. Finally, we obtained 111,390 transcripts corresponding to 98,514 T. chinensis genes. The statistical values of the assembled transcriptome are shown in Table 1. The GC content and N50, a statistical measure of the average length of a sequence set, were calculated as 42.29% and 1,368, respectively, for the assembled T. chinensis transcriptome. Next, we evaluated the length distribution of the assembled T. chinensis transcriptome. As shown in Figure 1(a), 39,302 transcripts (39.89%) with lengths between 200 and 300 nt and 3,341 transcripts (1.25%) longer than 4,000 nt were obtained. The completeness of the assembled T. chinensis transcriptome was evaluated using BUSCO, which identified 249 (97.7%) complete (197 complete and single-copy and 52 complete and duplicated), 4 fragmented, and 2 missing BUSCOs.        (Figure 1(b)). Notably, 10,509 transcripts were annotated by all these databases (Figure 1(b)). Among the NR mapping results, we found that the top four species hits to the assembled T. chinensis transcriptome were Vitis vinifera (10,313 transcripts), Theobroma cacao (1,980 transcripts), Nelumbo nucifera (1,828 transcripts), and Ziziphus jujuba (1,481 transcripts) (Figure 1(c)). COG annotation (Figure 1(d)) revealed 1,126, 730, 198, and 31 transcripts from "signal transduction mechanism," "energy production and conversion," "defense mechanism," and "cell motility," respectively. Among the 26,276 transcripts with GO annotations (Figure 1(e)), we found that the top four GO terms were "metabolic process" (13,530 transcripts), "cellular process" (12,603 transcripts), "catalytic activity" (10,471 transcripts), and "binding" (9,847 transcripts). Additionally, we identified 1,061 and 192 transcripts related to "signaling" and "signal transducer activity," respectively ( Figure 1(e)). By mapping the T. chinensis transcriptome to the KEGG pathway database, we identified transcripts from five categories (Figure 1(f)), including cellular processes, environmental information processing, genetic information processing, human diseases, metabolism, and organismal systems. Interestingly, metabolism was enriched in most of the transcripts, and 4,896 transcripts were annotated from the "global and overview maps" of metabolisms. We detected 833 and 855 transcripts involved in the pathways of "signal transduction" and "environmental adaption," respectively ( Figure 1(f)).

Scaled value
Using the BLAST results, we identified 38,479 coding sequences (CDSs) from the assembled T. chinensis transcriptome, which consisted of 24.21 million bases (mean length: 629, N50: 927, and GC content: 46.95%). Next, ESTScan identified 3,704 CDSs from the T. chinensis transcriptome, whose size was 1.26 million bases (mean length: 339, N50: 330, and GC content: 50.74%) [31]. Thus, in total, we obtained 42,183 CDSs for the T. chinensis seeds under cold stress (25.46 million bases, mean length: 603, N50: 891, and GC content: 47.14). Furthermore, we predicted 35,268 likely proteins derived from 30,728 of the T. chinensis transcripts using TransDecoder (https://github.com/ TransDecoder/TransDecoder). SignalP and TMHMM identified 1,622 signal peptides and 6,795 transmembrane domains among the likely proteins of T. chinensis seeds. Annotations obtained from different perspectives improved our understanding of the assembled transcriptome and were useful for the identification of cold-responsive genes in the T. chinensis seeds. However, further experiments are required to explore some transcripts that were annotated without encoding capacity.

Gene Expression Profiles of the T. chinensis Seeds under
Cold Stress. The viability of T. chinensis seeds decreased quickly under cold stress, and we investigated the coldresponsive genes of T. chinensis seeds. Read mapping showed that 81.52%-84.95% of the clean reads were aligned  Table 1, and Table S2). Pearson's correlation analysis revealed that the linear correlation between these samples was greater than 0.97, indicating a limited difference between them. The distribution of gene expression revealed that 58.30%-58.68% of detected genes in the T. chinensis seeds were expressed with FPKM values ranging from 10 to 99 (Figure 2(b)). Notably, 64, 69, 74, and 83 T. chinensis genes were expressed at more than 1000 FPKM in A0, A1, A2, and A3, respectively. We next compared the highly expressed genes in all four samples (Table S2), and 9 of the top 10 highly expressed genes were shared by all four samples. Among them, TR50621|c2_g12 was specific to A0, and TR36565|c1_g1 (lipid transfer protein) was specific to A1, A2, and A3. Then, we counted the numbers of some known cold responsive genes. As shown in Table 2, in the T. chinensis seed transcriptome, we identified 18,22,3,9,36,529, 327, 402, 82, and 297 genes related to abscisic acid, aquaporin, C-repeat binding factor (CBF), cold stress (e.g., coldregulated (COR) protein), heat shock protein (HSP), protein kinase (PK), ribosomal protein (RP), transcription factor (TF), zinc finger protein (ZFP), and ubiquitin, respectively.
3.6. Transcription Factors. We next further analyzed the TF expression patterns in the T. chinensis seeds under cold stress. Four hundred two TF genes were annotated in the T. chinensis seed transcriptome (Table 2), and 76 were differentially expressed in response to cold (Table 3). ER TF was the largest (27 genes) class that was differentially expressed in T. chinensis seeds under cold stress (Table 3 and Figure 4(a)). Notably, the expression levels of some ER TF genes were increased in the T. chinensis seeds at the late stage of cold treatment but not in A1 or A2. The downregulation of ER TF genes might be related to the seed development and other seed activities at normal temperature. Similar to ER TF genes, the functions of some other TF classes, such as WRKY and bHLH, are difficult to determine, as we observed both up-and downregulation of these genes in the T. chinensis seeds under cold stress (Figure 4(b)). However, the upregulation of AP2 domain class TF genes and the downregulation of GATA and MYB TF genes might participate in the cold response of the T. chinensis seeds, as they showed consistent regulation at all three time points of the T. chinensis seeds under cold stress (Table 3 and Figure 4(b)).

Protein Kinases, Ribosomal Proteins, and Zinc Finger
Proteins. As shown in Table 2, 60, 11, and 20 genes encoding PK, RP, and ZFP were differentially expressed in the T. chinensis seeds under cold stress, and their expression patterns are presented in Figure 4(c). Although some genes encoding calcium dependent, serine-threonine (e.g., LRR receptor-like and PTI1-like tyrosine) PKs were upregulated in A1 compared to A0; more PK genes from these subtypes were upregulated in A3 (Figure 4(c), left). Interestingly, we identified 402 RP genes in the T. chinensis seed transcriptome, but only 11 of them were deregulated in response to cold stress, including four 40S (e.g., RPS3 and RPS4) and seven 60S (e.g., RPL6, RPL7, and RPL10) RP genes. The overall expression of RP genes was upregulated in T. chinensis seeds to defend against cold stress (Figure 4(c), middle). We also detected both upregulated and downregulated ZFP genes in T. chinensis seeds after cold treatment (Figure 4(c), right). Two major classes of ZFP genes were identified and presented divergent expression patterns: dof ZFP and ZAT. The dof ZFP genes were downregulated, while the ZAT genes were upregulated in T. chinensis seeds under cold stress.
3.8. qRT-PCR. We used qRT-PCR to validate the expression patterns of genes in the T. chinensis seeds under cold stress. Eleven genes (e.g., CBF1, GATA4, MYB44, ERF4, ERF010, and ZAT10) and an internal control (18S rRNA) were selected to perform the qRT-PCR validation experiment. The primers for these candidate genes used in qRT-PCR are listed in Table S1. After ΔΔCt values were calculated, we used log2 relative normalized expression (log2RNE) to show the changes in the expression of target genes between two samples, similar to the transcriptome sequencing. Among the 33 events showing the expression patterns of the target genes in T. chinensis seeds under cold stress, we found that 31 (93.9%) exhibited consistent changes in both RNA-Seq and qRT-PCR experiments ( Figure 5). Notably, the expression patterns of CBF1, SOBIR1, and ERF10 were confirmed by qRT-PCR. The high agreement of gene expression patterns obtained using RNA-Seq and qRT-PCR indicates that the genes identified in this study might function in the cold response of T. chinensis seeds.

Discussion
T. chinensis is a plant that is sensitive to cold, and this study was designed to investigate the changes in gene expression in T. chinensis seeds under cold stress. Due to the lack of genome sequences, we assembled the transcriptome for T. chinensis seeds using RNA-Seq, similar to the studies of other plants, such as Magnolia wufengensis [32], Passiflora edulis Sims [33], Hevea brasiliensis [20], and Rumex patientia [19]. We assembled 98,514 T. chinensis genes and 35,268 likely proteins expressed in the T. chinensis seeds in response to cold. Interestingly, many of the T. chinensis genes were predicted to have similarities with grape genes (Figure 1(c)). A number of cold-responsive genes are involved in multiple biological processes and pathways, such as plant hormone signal transduction [32,33] and metabolic processes [19,20,32,33]. The DEGs identified in the T. chinensis seeds under cold stress were determined to be enriched in these pathways (Figures 2(d)-2(i)). Some known cold-responsive genes, including ABA, aquaporin, CBF, COR, HSP, PK, RP, ZFP ubiquitin, and TFs, were identified ( Table 2). In addition, we found that genes encoding CDC20-1, CDC20-2, YLS9, and EXORDIUM were upregulated in the T. chinensis seeds in response to cold (Table S3).
Cold acclimation temperatures have the potential to induce profound changes in the plant transcriptome. Approximately 4% to 20% of the genes in the Arabidopsis genome are affected by cold [34,35]. Interestingly, different pathways were activated in the T. chinensis seeds during cold treatment (Figures 2(g)-2(i)). The "MAPK signalling pathway-plant" was significantly enriched in the DEGs in A2, but not A1, compared to A0 (Figure 2(h)). This pathway has been reported to modulate plant tolerance to multiple abiotic stresses, such as drought, salt, cold, and heat [36]. The MAPK cascades have been reported to convert the environmental stimuli into cellular responses and to negatively regulate freezing tolerance via the phosphorylation of ICE1, a basic-helix-loop-helix transcription factor that regulates the expression of CBF genes [37], which mediate coldinducible transcription and play a key role in freezing tolerance and cold acclimation by binding to the C-repeat/DRE element [38]. MAPK pathway activation is a rapid response to cold, as MPK3, MPK4, and MPK6 are rapidly activated after cold treatment [37]. Compared to A0, we identified 10, 14, and 21 genes from the MAPK pathway that were differentially expressed in the T. chinensis seeds after exposure to cold for 12 h, 24 h, and 36 h, respectively (Figures 2(h) and 2(i)). Among them, MPK9 was commonly upregulated at all three time points while two MAPKK kinases (NPK1-like and YODA) were specifically upregulated in A3 (Table S3). NPK1-like and its clustered genes, namely, other NPK-like genes (e.g., OsNPKL2, 3, and 4), are induced by cold in rice [39]. YODA was shown to be upstream of MKK4/ MKK5, a negative regulator of freezing tolerance [37], and downstream of the ER receptor in regulating coordinated local cell proliferation, which shapes the morphology of plant organs [40].
Interestingly, the ICE1 TF gene (TR144797|c0_g1) was not dysregulated in T. chinensis seeds under cold stress; however, the CBF1 gene (TR16191|c0_g1) was upregulated during this process (Table S2). The Arabidopsis CBF genes are transcribed within a short period of exposure to cold stress and subsequently induce the COR gene expression [41,42]. We detected two COR genes that were deregulated in the T. chinensis seeds under cold stress: TR39356|c0_g1 (cold-regulated 413 plasma membrane protein 2-like) and TR38199|c0_g1 (cold-regulated gibberellin-regulated protein 1 LTCOR12) (Table S3). In Arabidopsis, CBFs were shown to be negatively regulated by a cold-induced C2H2 zinc finger transcription factor gene, ZAT12 [43], which is also regulated by ICE1 [44], and to be inducers of the C2H2 transcription factor gene ZAT10 [45]. In the present study, we observed the upregulation of ZAT10 and the downregulation of ZAT12 (Table S3) in T. chinensis seeds under cold stress. CBF2, ZAT12, and ZAT10 were shown to regulate 172, 67, and 54 COR genes in Arabidopsis [46]. Thus, the upregulation of CFB and COR genes might be regulated by other modulators, such as MYB15, a negative regulator of CBFs in Arabidopsis [47], and GATA TFs, which were observed to be downregulated in T. chinensis seeds under cold stress (Figure 4(b)).
LRR-RLK (LRR receptor-like serine/threonine-protein kinase) might be another group of PKs participating in the early response to cold stimuli in T. chinensis seeds, as we identified 8 LRR-RLK genes (5, 1, and 5 in A1, A2, and A3, respectively) that were differentially expressed in the T. chinensis seeds under cold stress (Figure 4(c) and Table S3). LRR-RLKs are a well-known class of RLKs that play important roles in plant growth, development, hormone perception, and responses to biotic/abiotic stresses [48]. They have been reported to be positive regulators of cold tolerance in Glycine soja [48], Glycine max [48], and Oryza sativa [49]. We also observed 10 aquaporin genes that were upregulated in the T. chinensis seeds under cold stress ( Table 2 and Table S3). Low environmental temperature has been proven to inhibit water uptake by roots [50]. In chilling-sensitive rice, low temperature treatment resulted in a gradual increase in the expression of aquaporin genes [51]. Li et al. found that the aquaporin gene GhTIP1 was substantially upregulated in cotyledons but downregulated in roots within a few hours after cotton seedlings were exposed to cold [52]. These studies, including the present study, support the hypothesis that aquaporin genes might be involved in the response to cold stress in plants. RP genes might be another class of cold-induced genes in plants that potentially enhances the cold tolerance, as both large and small RP subunits have been shown to be induced in soybean in response to cold [53]. In tobacco, ribosomal proteins Rps2, Rps4, and Rpl20 are essential for cell survival, and Rpl33 is required for sustaining a sufficient plastid translation capacity in cold temperatures [54]. The upregulation of RP genes was also observed in Hippophae rhamnoides under cold and freeze